Optimal Dynamic Strings
Abstract
In this paper we study the fundamental problem of maintaining a dynamic collection of strings under the following operations:
• concat: concatenates two strings,
• split: splits a string into two at a given position,
• compare: finds the lexicographical order (less, equal, greater) between two strings,
• LCP: calculates the longest common prefix of two strings.
We present an efficient data structure for this problem, where an update requires only O(log n) worst-case time with high probability, with n being the total length of all strings in the collection, and a query takes constant worst-case time. On the lower bound side, we prove that even if the only possible query is checking equality of two strings, either updates or queries take amortized Ω(log n) time; hence our implementation is optimal.
Such operations can be used as a basic building block to solve other string problems. We provide two examples. First, we can augment our data structure to provide pattern matching queries that may locate occurrences of a specified pattern p in the strings in our collection in optimal O(|p|) time, at the expense of increasing update time to O(log² n). Second, we show how to maintain a history of an edited text, processing updates in O(log t log log t) time, where t is the number of edits, and support pattern matching queries against the whole history in O(|p| log t log log t) time.
arXiv:1511.02612v1 [cs.DS] 9 Nov 2015
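To make the four operations concrete, here is a minimal reference model in Python. It is not the data structure from the paper: it stores plain Python strings, so every operation takes time linear in the string lengths rather than the bounds stated above. The class name `NaiveStringCollection`, the integer handles, and the helper methods `make_string` and `text` are assumptions made for this illustration, as is the choice that `concat` and `split` add new strings and leave their operands in the collection. The sketch also shows how `compare` can be answered from an LCP query plus one character comparison.

```python
class NaiveStringCollection:
    """Naive reference model of a dynamic string collection.

    Illustrative only: strings are stored verbatim and addressed by
    integer handles (a hypothetical interface, not the paper's).
    """

    def __init__(self):
        self._strings = {}   # handle -> stored string
        self._next_id = 0

    def make_string(self, text):
        """Add a new string to the collection and return its handle."""
        handle = self._next_id
        self._next_id += 1
        self._strings[handle] = text
        return handle

    def text(self, handle):
        """Return the string behind a handle (helper for inspection)."""
        return self._strings[handle]

    def concat(self, a, b):
        """Add the concatenation of strings a and b; operands are kept."""
        return self.make_string(self._strings[a] + self._strings[b])

    def split(self, a, pos):
        """Add the prefix of length pos and the remaining suffix of a."""
        s = self._strings[a]
        return self.make_string(s[:pos]), self.make_string(s[pos:])

    def lcp(self, a, b):
        """Length of the longest common prefix of strings a and b."""
        s, t = self._strings[a], self._strings[b]
        n = 0
        while n < len(s) and n < len(t) and s[n] == t[n]:
            n += 1
        return n

    def compare(self, a, b):
        """Return -1, 0, or 1 for a < b, a == b, a > b (lexicographic)."""
        s, t = self._strings[a], self._strings[b]
        k = self.lcp(a, b)
        if k == len(s) and k == len(t):
            return 0
        if k == len(s):      # a is a proper prefix of b
            return -1
        if k == len(t):      # b is a proper prefix of a
            return 1
        # The first mismatching character decides the order.
        return -1 if s[k] < t[k] else 1


if __name__ == "__main__":
    coll = NaiveStringCollection()
    a = coll.make_string("abra")
    b = coll.make_string("cadabra")
    ab = coll.concat(a, b)
    left, right = coll.split(ab, 4)
    print(coll.text(ab), coll.lcp(a, ab), coll.compare(left, a))
```

A model like this is mainly useful as a test oracle: a faster implementation can be checked against it on random sequences of operations.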
Similar resources
Optimal String Mining Under Frequency Constraints
We propose a new algorithmic framework that solves frequency-related data mining queries on databases of strings in optimal time, i.e., in time linear in the input and the output size. The additional space is linear in the input size. Our framework can be used to mine frequent strings, emerging strings and strings that pass other statistical tests, e.g., the χ²-test. In contrast to the presented...
On the optimality of parsing in dynamic dictionary based data compression (preliminary version)
Since the introduction of dynamic dictionary based data compression by Ziv and Lempel two decades ago, many dictionary construction schemes have been proposed and implemented. This paper considers the following question: once a dynamic dictionary construction scheme is selected, is there an efficient dynamic parsing method that results with the smallest number of phrases possible for the selected sch...
Head Automata and Bilingual Tiling: Translation with Minimal Representations
We present a language model consisting of a collection of costed bidirectional finite state automata associated with the head words of phrases. The model is suitable for incremental application of lexical associations in a dynamic programming search for optimal dependency tree derivations. We also present a model and algorithm for machine translation involving optimal “tiling” of a dependency t...
Mount CMSC 451: Lecture 11 Dynamic Programming: Longest Common
Strings: One important area of algorithm design is the study of algorithms for character strings. Finding patterns or similarities within strings is fundamental to various applications, ranging from document analysis to computational biology. One common measure of similarity between two strings is the length of their longest common subsequence. Today, we will consider an efficient solution to ...
On Minimal Strings Containing the Elements of Sn by Decimation
One of the most fundamental objects of combinatorics is perhaps Sn, the set of all permutations of an alphabet of size n. This paper investigates a new problem concerning Sn: the strings containing Sn by decimation. Decimation means that each permutation is formed by deleting entries in the string. This investigation has focused on strings of minimal length satisfying the decimation constraint...
Publication date: 2018